IMPROVED ESTIMATES FOR THE NUMBER OF PRIVILEGED WORDS

JEREMY NICHOLSON AND NARAD RAMPERSAD 


Abstract. In combinatorics on words, a word $w$ of length $n$ over an alphabet of size $q$
is said to be privileged if $n \le 1$, or if $n \ge 2$ and $w$ has a privileged border that occurs
exactly twice in $w$. Forsyth, Jayakumar and Shallit proved that there exist at least
$2^{n-5}/n^2$ privileged binary words of length $n$. Using the work of Guibas and Odlyzko, we
prove that there are constants $c$ and $n_0$ such that for $n \ge n_0$, there are at least
$cq^n/(n(\log_q n)^2)$ privileged words of length $n$ over an alphabet of size $q$. Thus, for $n$
sufficiently large, we improve the earlier bound set by Forsyth, Jayakumar and Shallit and
generalize it to all $q$.


1. Introduction 

The class of privileged words was recently introduced by Kellendonk, Lenz, and Savinien 
[3] and studied further by Peltomäki [4]. This class of words has been studied due to its
connection with palindromes and the class of so-called “rich words”. A word w is privileged 
if either 

• w is a single letter, or

• w has a privileged border that occurs exactly twice in w. 

Recall that a border of a word $w$ is a non-empty word that is both a prefix and a suffix
of $w$. Forsyth, Jayakumar and Shallit [1] looked at the problem of enumerating privileged
words over a binary alphabet. They proved that there exist at least $2^{n-5}/n^2$ privileged
binary words of length $n$. In their paper they sketch a method for potentially improving
this estimate. In the present paper we apply some results of Guibas and Odlyzko [2] on the
size of prefix-synchronized codes to carry out this method. We are thus able to obtain the
following asymptotic improvement to the result of Forsyth, Jayakumar, and Shallit, and also
generalize the result to arbitrary alphabets.
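The recursive definition of privileged words can be checked by brute force for small lengths. The following Python sketch is an illustration only: the names `count_occurrences`, `is_privileged`, and `count_privileged` are our own (they do not come from the cited papers), the alphabet is assumed to have at most 26 letters, and exhaustive enumeration is only practical for short words.

```python
from functools import lru_cache
from itertools import product
from string import ascii_lowercase


def count_occurrences(w, u):
    """Count the (possibly overlapping) occurrences of u as a factor of w."""
    return sum(1 for k in range(len(w) - len(u) + 1) if w.startswith(u, k))


@lru_cache(maxsize=None)
def is_privileged(w):
    """Return True if w is privileged according to the recursive definition."""
    if len(w) <= 1:
        return True
    # Try every proper border u of w (a prefix that is also a suffix).
    for k in range(1, len(w)):
        u = w[:k]
        if w.endswith(u) and count_occurrences(w, u) == 2 and is_privileged(u):
            return True
    return False


def count_privileged(n, q=2):
    """Count privileged words of length n over an alphabet of size q (q <= 26)."""
    alphabet = ascii_lowercase[:q]
    return sum(is_privileged("".join(t)) for t in product(alphabet, repeat=n))
```

For the binary alphabet, `[count_privileged(n) for n in range(1, 6)]` gives `[2, 2, 4, 4, 8]`.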

Theorem 1. There exist constants $c$ and $n_0$ such that for $n \ge n_0$, there are at least
$cq^n/(n(\log_q n)^2)$ privileged words of length $n$ over an alphabet of size $q$.

The estimates given in this theorem derive from the asymptotic analysis of maximal
prefix-synchronized codes carried out by Guibas and Odlyzko [2]. Given an alphabet of size $q$,
a block length $N$ and a prefix $P$ of length $p < N$, a prefix-synchronized code is a set
of length-$N$ codewords with the property that every codeword starts with a fixed prefix
$P = a_1 a_2 \cdots a_p$, and furthermore, for any codeword $a_1 a_2 \cdots a_p b_1 b_2 \cdots b_{N-p}$, the
prefix $P$ does not appear as a factor of $a_2 \cdots a_p b_1 \cdots b_{N-p} a_1 \cdots a_{p-1}$. In other words, the
border $a_1 a_2 \cdots a_p$ of $a_1 a_2 \cdots a_p b_1 \cdots b_{N-p} a_1 a_2 \cdots a_p$ occurs exactly twice in this word.
Next, we define $G(N) = G_P(N)$ as the size of a maximal prefix-synchronized code with
these parameters. In other words, $G_P(N)$ is the number of $q$-ary words $a_1 \cdots a_{N+p}$ such that
$a_k \cdots a_{k+p-1} = P$ for $k = 1$ and $k = N + 1$, and for no other $k$ with $1 \le k \le N + 1$. If we
take $n = N + p$ where the prefix $P$ of length $p$ is a privileged word, then $G_P(N)$ counts all
words $w$ of length $n$ with a privileged border $P$ that occurs exactly twice in $w$. All these words
are necessarily privileged words. If we sum up $G_P(N)$ over all privileged $P$ of length $p$ for
some $p$, then we obtain a lower bound for the number of privileged words of length $n$.
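The quantity $G_P(N)$ can be computed directly from its definition when $N + p$ is small. The sketch below is a naive enumeration under our own naming (`occurrences` and `G` are hypothetical helper names); it scans all $q^{N+p}$ words, so it is meant only to make the definition concrete, not to be efficient.

```python
from itertools import product


def occurrences(word, P):
    """Return the 0-indexed starting positions of P as a factor of word."""
    return [k for k in range(len(word) - len(P) + 1) if word.startswith(P, k)]


def G(P, N, alphabet="01"):
    """Count words of length N + len(P) in which P occurs exactly at
    positions k = 1 and k = N + 1 (1-indexed), i.e. 0 and N (0-indexed)."""
    p = len(P)
    count = 0
    for letters in product(alphabet, repeat=N + p):
        word = "".join(letters)
        if occurrences(word, P) == [0, N]:
            count += 1
    return count
```

For example, `G("01", 3)` counts the two binary words `01001` and `01101`, while `G("00", 2)` is 0 because `0000` contains a third occurrence of `00`.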


2. Proof of Theorem 1

To prove Theorem 1 we begin with some lemmas. In all the following lemmas and for the
rest of this document, $q$ is the size of the alphabet and it will be a fixed integer $\ge 2$, $p$ is
the size of the prefix $P$, and $Q = v_1 v_2 \cdots v_p$ is the autocorrelation of $P$, defined as follows.
If $P = a_1 \cdots a_p$, then for $i = 1, \ldots, p$ we define

\[
v_i =
\begin{cases}
1 & \text{if } a_i \cdots a_p = a_1 \cdots a_{p-i+1}, \\
0 & \text{otherwise.}
\end{cases}
\]

We also define the polynomial

\[
f(z) = f_Q(z) = \sum_{i=1}^{p} v_i z^{p-i}.
\]
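As a concrete illustration of the autocorrelation and its polynomial, the following Python sketch computes $v_1, \ldots, v_p$ and evaluates $f(z) = \sum_{i=1}^{p} v_i z^{p-i}$. The function names are our own, and the indexing follows the reading of the definition in which $v_i = 1$ exactly when the suffix of $P$ starting at position $i$ equals the prefix of the same length.

```python
def autocorrelation(P):
    """Return [v_1, ..., v_p], where v_i = 1 iff P[i..p] equals P[1..p-i+1]."""
    p = len(P)
    return [1 if P[i - 1:] == P[:p - i + 1] else 0 for i in range(1, p + 1)]


def f(P, z):
    """Evaluate f(z) = sum_{i=1}^{p} v_i * z**(p - i) for the word P."""
    p = len(P)
    v = autocorrelation(P)
    return sum(v[i - 1] * z ** (p - i) for i in range(1, p + 1))
```

For instance, `autocorrelation("aba")` is `[1, 0, 1]`, so `f("aba", 2)` equals $2^2 + 1 = 5$. Note that $v_1 = 1$ for every word, which is why $f(q)$ always has leading term $q^{p-1}$.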


The next series of lemmas are Lemmas 3-6 of [2].

Lemma 2. If $p$ is sufficiently large, then $1 + (z - q)f(z)$ has exactly one zero $\rho$ that satisfies
$|\rho| \ge 1.7$.

In what follows, the quantity $\rho$ is the $\rho$ specified by the previous lemma.


Lemma 3. If $p$ is sufficiently large, then

\[
\ln \rho = \ln q - \frac{1}{qf(q)} - \frac{f'(q)}{qf(q)^3} - \frac{1}{2q^2 f(q)^2} + O\left(\frac{p^2}{q^{3p}}\right).
\]
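The zero $\rho$ and the expansion of $\ln \rho$ can be compared numerically. The sketch below is a hedged illustration: it locates a real zero of $1 + (z - q)f(z)$ on the interval $[1.7, q]$ by bisection (assuming a sign change there, which holds once $p$ is large enough) and compares $\ln \rho$ with the three explicit correction terms $1/(qf(q))$, $f'(q)/(qf(q)^3)$ and $1/(2q^2 f(q)^2)$. All function names are our own.

```python
import math


def autocorrelation(P):
    p = len(P)
    return [1 if P[i - 1:] == P[:p - i + 1] else 0 for i in range(1, p + 1)]


def f(P, z):
    p = len(P)
    v = autocorrelation(P)
    return sum(v[i - 1] * z ** (p - i) for i in range(1, p + 1))


def f_prime(P, z):
    # Derivative of f; the i = p term is constant and contributes nothing.
    p = len(P)
    v = autocorrelation(P)
    return sum(v[i - 1] * (p - i) * z ** (p - i - 1) for i in range(1, p))


def dominant_zero(P, q, lo=1.7, iterations=200):
    """Bisection for a real zero of g(z) = 1 + (z - q) f(z) on [1.7, q]."""
    def g(z):
        return 1 + (z - q) * f(P, z)

    hi = float(q)  # g(q) = 1 > 0
    if g(lo) >= 0:
        raise ValueError("no sign change on [1.7, q]; P is probably too short")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def log_rho_estimate(P, q):
    """Explicit terms of the expansion of ln(rho), without the O-term."""
    fq, dfq = f(P, q), f_prime(P, q)
    return (math.log(q) - 1 / (q * fq) - dfq / (q * fq ** 3)
            - 1 / (2 * q ** 2 * fq ** 2))
```

With `P = "00000001"` and `q = 2` (so $f(z) = z^7$), `dominant_zero(P, 2)` lies just below 2, and `math.log(dominant_zero(P, 2))` agrees with `log_rho_estimate(P, 2)` far more closely than with the crude estimate $\ln 2$.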




Define $R_Q$ by

\[
R_Q = \frac{(q - \rho)^2 \rho^{p-2}}{1 - (q - \rho)^2 f'(\rho)}.
\]

Lemma 4. $G(N) = R_Q \rho^N + O((1.7)^N)$.
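The approximation $G(N) \approx R_Q \rho^N$ can be tested on a small case. The sketch below is illustrative only and rests on two assumptions of ours: it takes $R_Q = (q-\rho)^2 \rho^{p-2} / (1 - (q-\rho)^2 f'(\rho))$ as the definition of $R_Q$, and it computes the exact count by enumerating words of the form $P\,u\,P$, which requires $N \ge p$. The helper names are hypothetical.

```python
from itertools import product


def autocorrelation(P):
    p = len(P)
    return [1 if P[i - 1:] == P[:p - i + 1] else 0 for i in range(1, p + 1)]


def f(P, z):
    p = len(P)
    v = autocorrelation(P)
    return sum(v[i - 1] * z ** (p - i) for i in range(1, p + 1))


def f_prime(P, z):
    p = len(P)
    v = autocorrelation(P)
    return sum(v[i - 1] * (p - i) * z ** (p - i - 1) for i in range(1, p))


def dominant_zero(P, q, lo=1.7, iterations=200):
    """Bisection for a real zero of 1 + (z - q) f(z) on [1.7, q]."""
    def g(z):
        return 1 + (z - q) * f(P, z)

    hi = float(q)
    if g(lo) >= 0:
        raise ValueError("no sign change on [1.7, q]; P is probably too short")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def R(P, q):
    """R_Q under the assumed formula stated in the lead-in."""
    rho = dominant_zero(P, q)
    p = len(P)
    return ((q - rho) ** 2 * rho ** (p - 2)
            / (1 - (q - rho) ** 2 * f_prime(P, rho)))


def exact_G(P, N, alphabet="01"):
    """Exact G_P(N) by enumerating words P + u + P with |u| = N - p."""
    p = len(P)
    if N < p:
        raise ValueError("this sketch only handles N >= p")
    count = 0
    for letters in product(alphabet, repeat=N - p):
        word = P + "".join(letters) + P
        hits = [k for k in range(N + 1) if word.startswith(P, k)]
        if hits == [0, N]:
            count += 1
    return count
```

For `P = "00000001"`, `q = 2` and `N = 16`, the exact count `exact_G(P, 16)` is 255 (every middle block of length 8 except `P` itself), and `R(P, 2) * dominant_zero(P, 2) ** 16` should be close to that value.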

Lemma 5. If $p$ is sufficiently large, then

\[
\ln R_Q = (p - 2)\ln q - 2\ln(f(q)) + \frac{3f'(q)}{f(q)^2} - \frac{p - 2}{qf(q)} + O\left(\frac{p^2}{q^{2p}}\right).
\]

In what follows, c’s, d’s and Greek letters denote positive constants. 


Lemma 6. Let $p$ be the unique integer such that

\[
\frac{\ln q}{q - 1}\, q^p \le N < \frac{\ln q}{q - 1}\, q^{p+1}.
\]

Let $P$ be a prefix of length $p$ and let $n = N + p$. There exist constants $N_0$ and $d$ such that
for $N \ge N_0$ we have

\[
G_P(N) \ge dq^n/n^2.
\]

(The constant $d$ may depend on $q$ but not on $N$.)
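The choice of $p$ in this lemma is easy to compute. The following sketch (with a hypothetical function name `prefix_length`) finds the unique integer $p \ge 0$ satisfying the two-sided inequality for a given $N \ge 1$; it uses a simple loop rather than a floating-point floor of a logarithm, to avoid rounding issues near the boundaries.

```python
import math


def prefix_length(N, q):
    """Return the integer p with (ln q/(q-1)) q^p <= N < (ln q/(q-1)) q^(p+1).

    Assumes N >= 1 and q >= 2, so that p = 0 satisfies the left inequality.
    """
    c = math.log(q) / (q - 1)
    p = 0
    while c * q ** (p + 1) <= N:
        p += 1
    return p
```

For example, with `q = 2` and `N = 100` the function returns `p = 7`, so the corresponding word length is `n = N + p = 107`. In particular $p$ grows like $\log_q N$, which is the source of the logarithmic factor in Theorem 1.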


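Proof. If $p = \lfloor \log_q N + \log_q(q - 1) - \log_q(\ln q) \rfloor$, then $G(N) \sim R_Q \rho^N$ by applying Lemma 4.
By Lemmas 3 and 5 we have

\begin{align*}
\ln(R_Q \rho^N) &= \ln R_Q + N \ln \rho \\
&= (p - 2)\ln q - 2\ln(f(q)) + \frac{3f'(q)}{f(q)^2} - \frac{p - 2}{qf(q)} + O\left(\frac{p^2}{q^{2p}}\right) \\
&\quad + N \ln q - \frac{N}{qf(q)} - \frac{Nf'(q)}{qf(q)^3} - \frac{N}{2q^2 f(q)^2} + O\left(\frac{Np^2}{q^{3p}}\right).
\end{align*}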

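Therefore,

\begin{align*}
G(N) &\sim R_Q \rho^N = \exp(\ln(R_Q \rho^N)) \\
&= \frac{q^{N+p-2}}{f(q)^2} \exp\left( \frac{3f'(q)}{f(q)^2} - \frac{p - 2}{qf(q)} - \frac{N}{qf(q)} - \frac{Nf'(q)}{qf(q)^3} - \frac{N}{2q^2 f(q)^2} + O\left(\frac{Np^2}{q^{3p}} + \frac{p^2}{q^{2p}}\right) \right).
\end{align*}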


Since the first digit of $Q$ will always be a 1, $f(q)$ will have the leading term $q^{p-1}$. Let
$\alpha, \beta, \gamma, \delta$, and $N_0$ be positive constants such that the following inequalities are valid for all
$N \ge N_0$.


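\begin{align*}
0 \le \frac{3f'(q)}{f(q)^2} &\le \frac{3(p-1)^2 q^{p-2}}{q^{2p-2}} = \frac{3(p-1)^2}{q^p} \le 2, \\
0 \le \frac{p-2}{qf(q)} &\le \frac{p-2}{q^p} \le \frac{p}{q^p} \le 1, \\
\frac{N}{qf(q)} &\le \frac{N}{q^p} \le \frac{N}{q^{\lfloor \log_q N - \log_q(\ln q) \rfloor}} \le \frac{\alpha N}{q^{\log_q N - \log_q(\ln q)}} = \frac{\alpha N \ln q}{N} \le \beta, \\
\frac{Nf'(q)}{qf(q)^3} &\le \frac{N(p-1)^2 q^{p-2}}{q^{3p-2}} = \frac{N(p-1)^2}{q^{2p}} = \frac{N}{q^p} \cdot \frac{(p-1)^2}{q^p} \le \frac{2\beta}{3}, \\
\frac{N}{2q^2 f(q)^2} &\le \frac{N}{2q^{2p}} \le \frac{N}{2q^{2\lfloor \log_q N - \log_q(\ln q) \rfloor}} \le \frac{\delta N}{q^{2\log_q N}} = \frac{\delta N}{N^2} \le \gamma, \\
O\left(\frac{Np^2}{q^{3p}} + \frac{p^2}{q^{2p}}\right) &= O\left(\frac{p^2}{q^{2p}}\right).
\end{align*}

Thus, for $N \ge N_0$,

\begin{align*}
G_P(N) &\sim \frac{q^{N+p-2}}{f(q)^2} \exp\left( \frac{3f'(q)}{f(q)^2} - \frac{p - 2}{qf(q)} - \frac{N}{qf(q)} - \frac{Nf'(q)}{qf(q)^3} - \frac{N}{2q^2 f(q)^2} + O\left(\frac{Np^2}{q^{3p}} + \frac{p^2}{q^{2p}}\right) \right) \\
&\ge \frac{d_1 q^{N+p-2}}{f(q)^2}
 \ge \frac{d_1 q^{N+p-2}}{((1 - q^p)/(1 - q))^2}
 \ge \frac{d_1 q^{N+p-2}}{(1 - q^p)^2}
 = \frac{d_1 q^{N+p-2}}{q^{2p} - 2q^p + 1} \\
&\ge \frac{d_1 q^{N+p-2}}{q^{2p} + 1}
 \ge \frac{d_2 q^{N+p}}{2q^{2p}}
 = \frac{d_3 q^N}{q^p}
 = \frac{d_3 q^N}{q^{\lfloor \log_q N + \log_q(q-1) - \log_q(\ln q) \rfloor}} \\
&\ge \frac{d_3 q^N}{q^{\log_q N + \log_q(q-1) - \log_q(\ln q)}}
 = \frac{d_3 q^N \ln q}{N(q - 1)}
 \ge \frac{d_4 q^N}{N}
 \ge \frac{d_4 q^{n-p}}{n} \\
&\ge \frac{d_4 q^{n - (\log_q N + \log_q(q-1) - \log_q(\ln q))}}{n}
 = \frac{d_4 q^n \ln q}{nN(q - 1)}
 \ge \frac{dq^n}{n^2}.
\end{align*}

□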


We can now complete the proof of Theorem 1.

Proof of Theorem 1. We define the function $B(n,q)$ as the number of privileged words of
length $n$ over an alphabet of size $q \ge 2$. Let $n = N + p$ where $p = \lfloor \log_q N + \log_q(q - 1) - \log_q(\ln q) \rfloor$.
Let $n_0$ be a constant such that whenever $n \ge n_0$ we have $p \ge N_0$, where $N_0$ is the constant
mentioned in Lemma 6. Then for $n \ge n_0$, we have

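\begin{align*}
B(n,q) &\ge \sum_{\substack{P \text{ privileged} \\ |P| = p}} G_P(N) \\
&\ge \sum_{\substack{P \text{ privileged} \\ |P| = p}} \frac{dq^n}{n^2} \\
&= \frac{dq^n}{n^2}\, B(\lfloor \log_q N + \log_q(q-1) - \log_q(\ln q) \rfloor, q) \\
&\ge \frac{c_1 q^n}{n^2} \cdot \frac{q^{\lfloor \log_q N + \log_q(q-1) - \log_q(\ln q) \rfloor}}{(\lfloor \log_q N + \log_q(q-1) - \log_q(\ln q) \rfloor)^2} \\
&\ge \frac{c_2 q^n}{n^2} \cdot \frac{q^{\log_q N + \log_q(q-1) - \log_q(\ln q)}}{(\log_q N + \log_q(q-1) - \log_q(\ln q))^2} \\
&\ge \frac{c_2 q^n}{n^2} \cdot \frac{N(q-1)}{(\ln q)(1 + \log_q N)^2} \\
&\ge \frac{c_3 q^n}{n^2} \cdot \frac{N}{(\log_q q + \log_q N)^2} \\
&\ge \frac{c_3 q^n}{n^2} \cdot \frac{n - p}{(\log_q qn)^2} \\
&\ge \frac{c_3 q^n}{n^2} \left( \frac{n}{(\log_q qn)^2} - \frac{\log_q n}{(\log_q qn)^2} - \frac{1}{(\log_q qn)^2} \right) \\
&= c_3 q^n \cdot \frac{1}{n(\log_q n)^2} \left( \frac{(\log_q n)^2}{(\log_q qn)^2} - \frac{(\log_q n)^3}{n(\log_q qn)^2} - \frac{(\log_q n)^2}{n(\log_q qn)^2} \right) \\
&\ge \frac{cq^n}{n(\log_q n)^2},
\end{align*}

since

\[
\frac{(\log_q n)^2}{(\log_q qn)^2} - \frac{(\log_q n)^3}{n(\log_q qn)^2} - \frac{(\log_q n)^2}{n(\log_q qn)^2}
\]

is positive and increases for $n > 2$. This completes the proof.

□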
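The final step uses the fact that the factor $(\log_q n)^2/(\log_q qn)^2 - (\log_q n)^3/(n(\log_q qn)^2) - (\log_q n)^2/(n(\log_q qn)^2)$ is positive and increasing. This is straightforward to sanity-check numerically; the short sketch below (with a hypothetical function name `tail_factor`) evaluates the expression in floating point over a finite range, which supports the claim but is of course not a proof.

```python
import math


def tail_factor(n, q):
    """Evaluate the three-term factor from the last step of the proof."""
    L = math.log(n, q)        # log_q n
    Lq = math.log(q * n, q)   # log_q (q n)
    return L ** 2 / Lq ** 2 - L ** 3 / (n * Lq ** 2) - L ** 2 / (n * Lq ** 2)


def is_positive_and_increasing(q, n_min=3, n_max=2000):
    """Check positivity and strict monotonicity for n_min <= n <= n_max."""
    values = [tail_factor(n, q) for n in range(n_min, n_max + 1)]
    positive = all(v > 0 for v in values)
    increasing = all(a < b for a, b in zip(values, values[1:]))
    return positive and increasing
```

The factor can be rewritten as $(\log_q n / \log_q qn)^2 (1 - \log_q(qn)/n)$, which makes both properties plausible: each of the two factors is increasing in $n$, and the second is positive once $n > \log_q(qn)$.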